Published online 14 February 2013 



Nucleic Acids Research, 2013, Vol. 41, No. 7 e87 

doi:10.1093/nar/gkt081 



High-resolution nucleosome mapping of targeted 
regions using BAC-based enrichment 

Erbay Yigit 1 , Quanwei Zhang 2 , Liqun Xi 2 , Dan Grilley 1 , Jonathan Widom 1 v , 
Ji-Ping Wang 2 '*, Anjana Rao 3 ' 4 and Matthew E. Pipkin 3 ' 4 '* 

department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA, 2 Department of 
Statistics, Northwestern University, Evanston, IL 60208, USA, 3 Division of Signaling and Gene Expression, The 
La Jolla Institute for Allergy and Immunology, La Jolla, CA 92037, USA and 4 Program in Cellular and Molecular 
Medicine, Children's Hospital and Harvard Medical School, Boston, MA 02115, USA 

Received October 29, 2012; Revised December 23, 2012; Accepted January 22, 2013 



ABSTRACT 

We report a target enrichment method to map 
nucleosomes of large genomes at unprecedented 
coverage and resolution by deeply sequencing 
locus-specific mononucleosomal DNA enriched via 
hybridization with bacterial artificial chromosomes. 
We achieved ~1 0000-fold enrichment of specific 
loci, which enabled sequencing nucleosomes at up 
to ~500-fold higher coverage than has been 
reported in a mammalian genome. We demonstrate 
the advantages of generating high-sequencing 
coverage for mapping the center of discrete nucleo- 
somes, and we show the use of the method by 
mapping nucleosomes during T cell differentiation 
using nuclei from effector T-cells differentiated 
from clonal, isogenic, naive, primary murine CD4 
and CD8 T lymphocytes. The analysis reveals that 
discrete nucleosomes exhibit cell type-specific oc- 
cupancy and positioning depending on differenti- 
ation status and transcription. This method is 
widely applicable to mapping many features of chro- 
matin and discerning its landscape in large 
genomes at unprecedented resolution. 

INTRODUCTION 

Nucleosomes are the fundamental repeating subunits of 
chromatin, and they consist of an octamer of histone 
proteins (H2A, H2B, H3 and H4) around which 
~147bp of DNA is tightly wrapped. Adjacent nucleo- 
somes are spaced by a short ~20-50-bp segment of 
'linker' DNA (1,2). Binding sites for sequence-specific 



DNA-binding factors are less accessible when the DNA 
containing the sites is wrapped in nucleosomes. In turn, 
this regulated accessibility influences the ability of tran- 
scription factors to recruit and mobilize the enzymatic 
machinery of replication, repair, recombination and 
transcription. 

The rules that govern nucleosome organization in vivo 
are under intense scrutiny (3-6). Deciphering these rules 
depends on methods to measure nucleosome occupancy 
across the genome, to assign their center locations at 
high-resolution in vivo and to quantify how they are re- 
modeled during cell activation and differentiation. The 
most widely used approach involves treating native 
chromatin with micrococcal nuclease (MNase), isolating 
DNA protected from digestion by the histone octamer and 
subjecting the protected DNA to high-throughput 
sequencing. Computational methods are used to calculate 
nucleosome maps based on an occupancy curve resulting 
from alignment of the sequenced reads to a genome (7,8). 
Much of our understanding of nucleosome positioning 
in vivo derives from model organisms, such as yeast, in 
which adequate sequencing coverage to map nucleosomes 
genome-wide at high-resolution can be achieved econom- 
ically because they have relatively small genomes (~250 
times smaller than humans and mice). But these model 
organisms typically lack the complex structure encoded 
in mammalian genomes. Most nucleosomes have not 
been mapped in mammalian genomes at high resolution 
because the cost to achieve sufficient whole-genome 
sequencing coverage of nucleosomal DNA in such large 
genomes remains prohibitively expensive, and only a small 
fraction of nucleosomes seem to be strongly positioned 
(6). As a result, much of what is known from whole- 
genome sequencing of human nucleosomal DNA derives 
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from the minority of nucleosomes that are strongly 
positioned and patterns of nucleosome occupancy that 
result after summarizing all nucleosome-derived reads 
from regions surrounding aligned structural features 
(e.g. transcriptional start sites or DNase I hypersensitive 
sites) in bins of genes, grouped based on their mRNA 
expression (6,9). Thus, nucleosome organization of mam- 
malian genomes has mainly been defined by the average 
ensemble of nucleosome occupancy from many loci, rather 
than high-resolution knowledge of discrete nucleosomes 
within individual genes. 

We envisioned target enrichment of mononucleosomal 
DNA as a strategy to achieve high enough sequencing 
coverage of nucleosome DNA to map discrete nucleo- 
somes and to focus on genomic regions that are likely to 
contain nucleosomes that undergo remodeling during cell 
differentiation. In this report, we developed a method to 
enrich mononucleosomal DNA from specific loci using 
hybridization to bacterial artificial chromosomes 
(BACs), by modifying a previously reported direct 
genomic selection method (10), introducing a multiplexing 
step and adapting the method to massive parallel 
sequencing technologies. We call this approach BEM-seq 
(bacterial artificial chromosomes enriched mono-nucleo- 
somal DNA sequencing). 

MATERIALS AND METHODS 

BAC DNA isolation and labeling 

Appropriate BAC clones were identified via the UCSC 
genome browser and were purchased from Children's 
Hospital of Oakland Research Institute. BAC DNA was 
isolated from 1 1 of culture, purified by chromatography 
(Qiagen) and analyzed for identity and quantity by restric- 
tion enzyme mapping using pulse-field gel electrophoresis 
(PFGE) and spectrophotometry as described previously 
(11). All BAC preparations used for enrichment were con- 
firmed to be >80% intact BAC DNA by PFGE. For each 
hybridization reaction, BAC DNA was labeled with a 
commercial nick translation kit (Roche cat. 10976 776 
001). The reaction (total 20 ul) was set-up in a polymerase 
chain reaction (PCR) tube on ice as follows: 120ng of 
BAC DNA, 5 ul of Biotin-dUTP/dNTP mixture, 2 ul of 
lOx nick translation buffer, 0.5 ul of a- 32 P [dCTP] 
(6000 Ci/mmol) and 2 ul of enzyme mixture. Biotin- 
dUTP/dNTP mixture: Biotin-16-dUTP (0.8 mM) 0.6 
volume, dTTP (0.8 mM) 2.4 volume, dATP (0.8 mM) 3.0 
volume, dCTP (0.8 mM) 3.0 volume and dGTP (0.8 mM) 
3.0 volume. Nick translation was carried out at 15°C for 
1 h in a thermal cycler. The reaction was stopped by 
addition of ethylenediaminetetraacetic acid (EDTA). The 
labeled BAC DNA was purified by size-exclusion chroma- 
tography (Bio-Rad, Micro Bio-Spin P-30). Biotinylated- 
dUTP (Enzo Life Sciences) incorporation was confirmed 
indirectly based on radioactivity: About 10% of the 
labeled BAC DNA was incubated with magnetic strep- 
tavidin beads in streptavidin-binding buffer [10 mM 
Tris-HCl (pH7.5), 1 mM EDTA and 1 M NaCl] at room 
temperature, and labeled DNA was collected on a magnet. 
The relative amount of radioactivity in biotinylated DNA 



captured with the magnetic beads versus that left in the 
supernatant was measured using a Geiger counter. If at 
least four times greater radioactivity was found in the 
beads relative to the supernatant, the BAC DNA was con- 
sidered to be successfully biotinylated. 

Mice, cells and T-cell differentiation 

Naive CD4 T and CD8 T cells were isolated from 6- to 16- 
week-old OT-II and P14 T-cell receptor (TCR) transgenic 
(Tg) mice, respectively, that were maintained on the 
Tcra~ l ~ background to eliminate T cells co-expressing en- 
dogenously rearranged TCRs. All mice were maintained 
in specific pathogen-free facilities and used according to 
protocols approved by the Immune Disease Institute and 
Harvard Medical School and La Jolla Institute for Allergy 
and Immunology animal care and use committees. T cells 
were isolated from single cell suspensions prepared from 
pooled spleen and lymph nodes, by positive (CD4 cells) or 
negative (CD8 cells) selection (Dynal). For T-cell differ- 
entiation, 1 x 10 7 naive OT-II CD4 T cells, or naive P14 
CD8 T cells, were activated for 48 h in 10 ml complete T- 
cell media supplemented with hamster aCD3 (2C11) and 
aCD28 (37.51) each at 1 ug/ml, in T25 flasks pre-coated 
with goat anti-hamster capture antibody. For T helper 
type 1 (Thl) cultures, media was supplemented with inter- 
leukin (IL)-12 (lOng/ml) and aIL-4 (11B11; 10ug/ml); for 
T helper type 2 (Th2) cultures, media was supplemented 
with IL-4 (supernatant, -1000 U/ml), alFNy (XMG1.2; 
5ug/ml) and odL-12 (3ug/ml). After 48 h, Thl and Th2 
cultures were resuspended, counted and diluted to 5 x 10 5 
cells/ml in fresh Thl or Th2 media supplemented with 
20 U/ml recombinant human IL-2 (rhIL-2; NIH bio- 
resource). Thl and Th2 cultures were expanded every 
24 h by counting and readjusting cells to 5 x 10 5 cells/ml, 
using Thl or Th2 media with rhIL-2 until day 4, and 
thereafter using media containing only rhIL-2. After the 
initial 48 h of stimulation, CD8 T cells were removed from 
the TCR stimulation and diluted to 5 x 10 5 cells/ml in 
fresh media containing 100 U/ml of rhIL-2 and were 
counted and readjusted to 5 x 10 5 cells/ml 100 U/ml of 
rhIL-2 media every 24 h (12). The differentiation status 
of T-cell cultures was monitored by flow cytometry 
using labeled antibodies recognizing CD4 (RM4-5), 
CD8a (53-6.7), CD25 (PC61) CD44 (IM7), CD62L 
(Mel-14), IL-7Ra (A7R34), T-bet (4B10) and Eomes 
(DanllMag) (Figure 1A and data not shown). Cytokine 
gene transcription was elicited pharmacologically by 
stimulating Thl and Th2 cells at a density of 1 x 10 6 /ml 
in the presence of 10 nM phorbol mystirate acetate (PMA) 
and 1 uM ionomycin for 6 h; brefeldin A was included for 
the final 2h of stimulation. Re-stimulated cells were har- 
vested, washed in phosphate-buffered saline (PBS), fixed 
in 2% paraformaldehyde, washed again in PBS and then 
permeabilized with 0.25% saponin in PBS supplemented 
with 1.0% Fetal Bovine Serum (FBS) (saponin-staining 
buffer). Permeabilzed cells were stained in saponin- 
staining buffer using labeled aIL-4 (11B11) and alFNy 
(XMG1.2) antibodies. Stained cells were washed in 
saponin-staining buffer before finally re-suspending PBS/ 
1.0% FBS for analysis (data not shown). 
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Nuclei isolation and MNase digestion 

Nuclei were prepared as previously described (11,13), with 
the following modifications detailed later in the text, 
although other methods to digest native chromatin that 
require less handling time are likely to work as well are 
known (14), and they might offer some advantages for 
mapping nucleosomes accurately. Exponentially growing 
lymphocyte cultures were harvested in a chilled centrifuge 
(4°C), washed, concentrated and adjusted to 2 x 10 7 cells/ 
ml in ice cold Ca 2+ /Mg 2+ free PBS. One volume of cells 
was lysed on ice for 10 min by adding four volumes of lysis 
buffer [12.5 mM Tris (pH 7.4), 45 mM KC1, 6.25 mM 
MgCl 2 , 375 mM sucrose and 0.125% nonidet P-40, sup- 
plemented with 2mM benzaminidine and one complete 
EDTA-free protease inhibitor cocktail/50 ml]. Nuclei 
were pelleted at 600g for 7 min in a pre-cooled swinging 
bucket rotor at 4°C. Pelleted nuclei were resuspended in 
nuclei wash buffer [10 mM Tris (pH 7.4), 60 mM KC1, 
15mM NaCl, 5mM MgCl 2 and 300 mM sucrose] using 
one-tenth volume relative to the initial volume at lysis. 
Resuspended pellets were pooled 2:1 to concentrate, and 
the final wash volume was adjusted to half the initial lysis 
volume with nuclei wash buffer. The nuclei were pelleted 
again as above, and supernatants were completely 
aspirated. For MNase (Sigma) digestion, pelleted nuclei 
were resuspended in 1 ml MNase buffer [10 mM Tris 
(pH 7.4), 15mM NaCl, 60 mM KC1, 150uM spermine 
and 500 uM spermidine] and then were adjusted to 
2 x 10 7 nuclei/ml in MNase buffer, on ice. MNase 
digestion was performed at 37°C in the presence of 
1 mM CaCl 2 , and the incubation time was titrated to 
identify optimal digestion for each nuclei preparation. 
Incubation times resulting in chromatin digested into 
~80% 147-bp fragments (core nucleosome) that were 
clearly resolved from 167-bp fragments (chromatosome) 
with limited evidence of intranucleosomal cleavage, were 
selected for a large preparative digestion. The digestion 
was stopped with a final concentration of lOmM EDTA 
and 1% sodium dodecyl sulfate (SDS), RNase A (20 ug/ 
ml) was added and the lysates were incubated at 37°C for 
30 min. The final concentration of NaCl was increased to 
0.5 M, Proteinase K was added to a final concentration of 
50ug/ml and the lysates were incubated at 55°C for at 
least 1 h. DNA was extracted with phenol/chloro- 
form, precipitated with ethanol and resuspended in TE 
buffer. 

Gel extraction and purification of mononucleosomal DNA 

Digested chromatin DNA was resolved on 3.3% agarose 
gels (NusieveGTG, Lonza) in lx TBE buffer until 147- 
and 167-bp DNA bands were clearly resolved. The 147-bp 
DNA band was carefully excised from the agarose gel, 
crushed and soaked in a buffer containing 300 mM 
sodium acetate, 1 mM EDTA and 0.05% SDS for 
24-48 h at room temperature with gentle agitation on a 
rocker. Agarose was removed from the crush and soak 
buffer by Amicon Ultrafree-CL filter (Millipore), and fil- 
trates were concentrated with Amicon- 15 Ultra filter 
devices (30kDa cut-off; Millipore) and were purified by 
QIAquick Spin Columns using buffer PB. Commercial gel 



extraction kits containing strong chaotropic reagents that 
melt AT rich sequences easily were found to substantially 
denature a large fraction of mononucleosomal DNA (data 
not shown), and thus were not used to reduce potential 
bias towards nucleosomal DNA with high GC content 
(15). 

Preparation of SOLiD library 

About 6 ug DNA was treated with a commercial DNA 
end-repair enzyme cocktail (Lucigene Cat No 40037-2) 
to make DNA ends ligatable. Repaired DNA ends were 
ligated to SOLiD PI and P2 adaptors with the Quick 
Ligase Kit (New England Biolabs cat. no M2200L). For 
some of the samples, repaired DNA ends were ligated to 
custom barcoded SOLiD PI adaptors (see Table 2) for 
multiplexing. Ligation products were separated on 6% 
native polyacrylamide gels, excised, extracted using the 
crush and soak method and purified. The gaps on 
unligated strands were repaired by DNA polymerase I 
(New England Biolabs cat. no M0209L,) in the presence 
of dNTPs at 16°C for 30 min. After each aforementioned 
step, the DNA was purified using QIAquick PCR 
Purification Kit (Qiagen). DNA concentration in the 
libraries was determined by absorbance at 260 nm in TE 
buffer using a Nanodrop (ThermoFisher Scientific). 

Enrichment of targets by BAC hybridization 

Labeled BAC DNA was transferred to a 200-ul PCR tube 
and lyophilized using a speedvac without applying any 
heat. To suppress repeats during hybridization, DNA 
pellets were resuspended in 5 ul of TE buffer containing 
10 ug of mouse cot-1 DNA (Invitrogen) and overlaid with 
mineral oil. Labeled BAC DNA was denatured at 96°C 
for 5 min and incubated at 65°C for an additional 15 min 
in a thermal cycler, and then 5ul of 2x hybridization 
buffer [1.5 M NaCl, 40mM sodium phosphate buffer 
(pH 7.2), lOmM EDTA (pH 8), lOx Denhardt's and 
0.2% SDS] was added to the DNA, and the mixture was 
incubated for 6h at 65°C. In a separate 200-ul PCR tube, 
5 ul (1-2 |a.g) of nucleosome library DNA was denatured at 
96°C for 5 min, and then incubated for an additional 
15 min at 65°C in a thermal cycler, and then 5ul of 2x 
hybridization buffer was added to the denatured library 
DNA. The entire volume of denatured nucleosome library 
DNA was added to the cot-1 suppressed labeled BAC 
DNA, and the mixture was incubated at 65° C for ~72- 
80 h. Fifty microliters of streptavidin-coated magnetic 
beads (Invitrogen) was washed with binding buffer and 
resuspended in 150ul streptavidin-binding buffer in a 
low-retention microcentrifuge tube (Eppendorf cat. no 
022 431 021). The hybridization mix was directly added 
to the pre-washed magnetic beads, and binding was 
carried out on a rotator at room temperature for 30 
min. The magnetic beads were washed once at 25°C for 
15 min in lx SSC with 0.1% SDS and three times at 65°C 
for 15 min in 0.1 x SSC with 0.1% SDS. All the washes 
were carried out in 1 ml of wash buffer on a heat block in a 
microcentrifuge tube with occasional mixing. After washes 
were completed, the hybridized nucleosomal DNA was 
denatured from the BAC hybrids by adding lOOmM 
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Table 1. Quantification of BAC-based target enrichment of mononucleosomal DNA using colony sequencing 



Cell type 



BAC 



Size 



Number of 
on-target clones 11 



Total number 
of clones 



On-target 
clones (%) 



Fold-enrichment 



CTL 

Thl 

Th2 

b Seven different 
cell types 



Prfl 
Prfl 
Prfl 

Nine different 
BAC clones 



224 kb 
224 kb 
224 kb 
1.87Mb 



67 
24 
25 
27 



95 
30 
30 
45 



71 

SO 
83 
60 



8600 
9700 
10 000 
810° 



Fold-enrichment = (number of on-target clones/number of total sequenced clones) x [the size of mouse Genome (bp)/the size of BAC clone (bp)]. 
"On-target clones = clones mapping to target BAC used for enrichment. 

b Nucleosome DNA from seven different cell types were barcoded during library preparation, and each library was enriched by hybridization with a 
mixture of nine different BAC clones. The enriched DNA was cloned and sequenced using standard methods. Of 45 sequenced clones, 27 matched 
regions in the nine BACs used for selection. 

The average size of one BAC within the nine used for multiplexed selection was 207.5 kb, corresponding to an average of 7300-fold enrichment per 
BAC clone. 



NaOH at room temperature for lOmin, and the magnetic 
beads were removed from the elution solution at room 
temperature. The elution solution was transferred to a 
new microcentrifuge tube, neutralized with 100 ul of 1 M 
Tris (pH 7.5) buffer and desalted by a size-exclusion 
column (Bio-Rad P-30 column). DNA was amplified by 
PCR (15-18 cycle) using PI and P2 PCR primers, gel 
extracted and submitted for SOLiD 4 sequencing. To 
quantify target enrichment, we blunt-end cloned an 
aliquot of the BAC-enriched PCR products from each 
enrichment (Invitrogen ZeroBlunt PCR cloning kit) and 
sequenced the inserts from antibiotic-resistant colonies 
using standard methods (Table 1). Enrichment was 
calculated as follows: Fold-enrichment = [genome size 
(bp)/BAC size (bp)] x (number of sequenced clones 
aligned to target BAC)/(total number of sequenced clones). 

Calculation of center-weighted nucleosome 
occupancy score 

For the single-end data, the average reads length is 
unknown. To construct the nucleosome reads occupancy, 
we need to know where the projected nucleosome center is 
based on each tag. However, the average reads length in 
different BACs or different regions on the same BAC may 
differ. Without adjustment of the average reads length, the 
correlation analysis (described later in the text) of the nu- 
cleosome occupancy can be biased. Suppose position i is 
the nucleosome center, then the reads on the Watson 
strand originating from this nucleosome may start from 
i-k, and the reads from Crick strand may start from i + k 
for some integer k. For practical purpose, we only 
consider Watson tags that reside 63-83-nt upstream and 
Crick tags 63-83-nt downstream, or equivalently, the 
complete reads length could vary between 127 and 
167 bp if both ends of the DNA insert had been 
sequenced. Let iv, and c,- be the observed tag frequency 
at position i on Watson and Crick strands, respectively. 
For any single-end tag on Watson position i, as its 
paired-end on Crick is unknown, the tag frequency in 
the region from position /'+ 1 26 to z+166 on the Crick 
strand, i.e. c,+i26, c ( '+i66, can be used to estimate the 
probability of that the paired-end start at the correspond- 
ing positions. Thus, if we observe ir ; tags at position i on 
the Watson strand, then the projected centers of the 



nucleosomes attributed to the w t tags are positions 
;+63, Z+63.5, ;+83 with corresponding reads frequency 
weight WjCi+k/ J2jL)+ 6 i26 c i f° r ^ = 126, 166.. The weight 
at half positions will be split into halves and added to 
neighboring integer positions. Likewise, for c, tags at 
position i on the Crick strand, the projected nucleosome 
center positions are i — 63, i — 63.5, i — 83 with weights 

CiWi-k/T^-m"! for *= 126, ....,166. Combing 
both strands, the expected reads frequency that is 
centered at position i (if both ends had been sequenced) 
can be calculated as 

83 /-A+166 

Ri = J2 Wi-k(c i+k / °j) 

A -63 y=/-A+126 

82 i— /fc+166 

+0.5 ^ Wi- k (c i+k+ l/ ^ c j) 
A=63 /=/-/t+126 
82 ;-A+165 

+0.5 ]T Wi- k -\{Ci+kl ^2 C j) 
A-63 ;=;-A+125 

(1) 

83 J+A--166 v ' 

+ C i+k(. w >~k/ J2 W d 
A-63 j=i+k-l26 
82 ;+A--166 

+0.5 ^ Ci+ k (Wi-k-ll ^ W J^ 
A-63 j=i+k-U6 
82 i+A-165 

+0.5 c i+k+ i(Wi-k/ ^2 w l) 

A=63 j=i+k-\25 

We define the center-weighted nucleosome occupancy 
score at position i as follows: 

0i = J2 W- 7 ;20), (2) 

IMI<60 

where G stands for a Gaussian density with SD = 20. The 
Gaussian weight function covers ±60 bp of the projected 
nucleosome center and decays towards two ends. The ad- 
vantage of using this weighting function is to avoid spikes 
in the reads occupancy curve frequently observed in the 
linker region because of overlap of reads arising from two 
neighboring nucleosomes when a uniform weight function 
is applied. 
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For the paired-end data, the number of reads that are 
centered at each position is directly observed. Therefore, 
Equation (2) can be readily applied to calculate the 
center-weighted nucleosome occupancy. 

Nucleosome center DNA alignment 

Based on the center-weighted occupancy scores of the 
paired-end data, we selected the strongest peaks sequen- 
tially based on peak height and peak steepness, requiring 
that two selected neighboring peaks be spaced by at least 
120 bp. The selected peak positions were treated as the 
nucleosome centers to examine the nucleosome-specific 
dinucleotide signal (e.g. the AA/TT/TA/AT frequency). 
The selected peaks represent the approximate center pos- 
itions of the nucleosomes, but the precision could be 
undermined by the well-known sequence preference of 
MNase. In addition, we searched for a sequencing read 
of length 147 bp nearest to the peak position in the ± 5-bp 
region because reads with length of ~ 147 bp tend to be 
better representatives of the true nucleosome positions 
compared with reads that are much shorter or longer 
than 147 bp. If no such sequence existed, we further 
searched for sequences of lengths 148, 146, 149, 145 and 
1 50 bp sequentially within ± 5-bp region of the peak until 
the first sequence was identified. The center of the 
identified sequence was treated as the nucleosome center 
to generate the AA/TT/TA/AT frequency plot. If no such 
sequence was identified in the ±5-bp region, the peak 
position itself was treated as the true nucleosome center 
to generate the alignment. 

Correlation analysis 

Let Ok — {o\k,02k,--,o„k}, k — 1, K be the center- 
weighted nucleosome occupancy score series in a 
genomic region of length n for condition 1 through K as 
calculated previously. We calculated the pairwise 
Spearman rank-based correlation between any two condi- 
tions for a given genomic region. 

RESULTS 

Generation of mononucleosome DNA libraries for NG 
sequencing and BAC enrichment 

To map mammalian nucleosomes, we used a series of 
well-established culture systems that drive the differenti- 
ation of naive T cells into defined effector T-cells subsets, 
a process that is associated with pronounced chromatin 
and transcriptional changes (11,16). Thus, we purified 
clonal populations of naive CD4 and CD8 T cells 
(Figure 1A) from inbred TCR Tg mice and differentiated 
these precursor cells into Thl and Th2 cells, and Cytotoxic 
T lymphocyte (CTL), respectively (Figure IB). Compared 
with purified T cells from human donors that have been 
used for nucleosome mapping (6,9), which are polyclonal, 
genetically disparate and notoriously heterogeneous func- 
tionally and phenotypically (17), purified mouse T-cells 
differentiated in culture are likely to have the least biolo- 
gical variability in nucleosome organization between indi- 
vidual cells (and alleles) within the population because 



they are clonal, isogenic and strongly polarized towards 
a specific phenotype. Thl, Th2 and CTL cells each differ- 
entially remodel chromatin structure in the Ifng, 114 and 
Prfl loci, and this underlies the specific transcription of 
each gene in the different cell types (Figure IB). We 
isolated nuclei from each of the three cell types and 
incubated them with MNase for increasing amounts of 
time to exhaustively cleave linker DNA and produce frag- 
ments protected by the histone octamer; DNA was ex- 
tracted from digested chromatin and was carefully 
resolved on 3.3% low-melting point lx TBE agarose 
gels (Figure 1C). Conditions that resulted predominantly 
in canonical mononuclesomal DNA fragments (i.e. 
147 bp) with limited evidence of intranucleosomal 
cleavage were selected for library preparation (Figure 

IC) . Mononucleosome DNA fragments were excised 
from the gel, purified, end-repaired and ligated to 
adaptors compatible with Applied Biosystems SOLiD 
sequencing to generate mononucleosome DNA libraries 
(Figure ID). In some cases, we used PI adapters 
customized with 6-bp barcodes to sequence different 
samples simultaneously as a mixture (see 'Materials and 
Methods' section). Notably, both careful attention to the 
extent of MNase digestion and the subsequent size selec- 
tion step reduced recovery of smaller DNA fragments 
derived from protection by other non-histone 
DNA-binding proteins, such as transcription factors and 
chromatin remodeling factors, or particularly unstable nu- 
cleosomes, which could otherwise complicate subsequent 
computation of nucleosome maps (18). To enrich nucleo- 
somal DNA from targeted loci, we hybridized the linkered 
libraries with individual biotinylated BAC clones covering 
the 114, Ifng and Prfl loci in separate reactions (Figure 

ID) . Mononucleosomal DNA annealed to the BACs 
was captured using streptavidin beads; non-specific 
DNA was washed away; and the captured nucleosomal 
DNA was eluted from the biotinylated BACs by alkali 
denaturation, then purified and subjected to 15-18 cycles 
of PCR with adaptor-specific SOLiD PI- and P2-specific 
primers (Figure ID). We quantified enrichments by 
cloning the enriched mononucleosome libraries, 
sequencing inserts from 30 to 100 clones using standard 
methods and aligning the sequences with the mouse 
genome (Figure IE). Based on 10 independent enrichment 
experiments using BAC clones spanning multiple different 
loci, we found that an average of 65% of clones mapped 
to the corresponding BAC regions used for selective 
capture, which equated to >8600-fold enrichment for a 
224 kb BAC clone after a single hybridization (Table 1 
and data not shown). 

Next, we multiplexed the BAC enrichment to capture 
sequences from multiple loci in a single hybridization. 
Mononucleosome DNA libraries prepared from each cell 
type were hybridized with mixtures of up to nine different 
BAC clones to simultaneously capture nucleosomes 
covering ~ 1.9 Mb of the genome. We sequenced the re- 
sulting libraries with standard methods and confirmed 
that each target locus was specifically enriched after hy- 
bridization with multiplexed BACs. The total enrichment 
of all regions covered by the nine different BACs in the 
multiplex was 810-fold, and enrichment of the specific 
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Figure 1. Homogeneous populations of T cells were used for nuclei preparations, isolation of mononuclesomal DNA and BEM-seq. (A) Flow 
cytometry was used to confirm the purity of naive CD4 and CD8 T cells after isolation. The percentage of CD4+ and CD8+ T cells of all events is 
shown. (B) Schematic shows the developmental relationship of CD4 and CD8 T cells, and it depicts their differentiation into classical Th cell and 
CTL subsets and their characteristic pattern of transcription of the Ifng, 114 and Prfl genes (green arrows indicate transcriptionally active genes). (C) 
Nuclei isolated from each different cell type were incubated with MNase for increasing time (wedges above blots) before the DNA was extracted and 
resolved on 3.3% low-melt agarose gels in lx TBE. Mononucleosomal DNA (~147bp) was excised from the gel, and the DNA was extracted to 
generate mononucleosomal DNA libraries. Note that only DNA from time points in which there was limited evidence of intranucleosome cleavage 
were selected for library preparation. (D) Schematic of BEM-seq method shows the work-flow. (E) Alignment of Sanger sequenced clones to the Prfl 
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Table 2. Quantification of BAC-based target enrichment of mononucleosomal DNA using SOLiD paired-end sequencing 



Cell type 


Barcode 


Number of 


Number of 


Number of 


On-target 


Number of 




sequence" 


unique tags 


non-unique tags 


unique on-target tags 


tags (%) 


BAC clones used 














for enrichment b 


CTL 


None 


5191686 


471 033 


4180213 


81 




CTL BR 


CGCTAG 


4 003 477 


466 044 


2 053 070 


51 


9 


Thl 


None 


5257 325 


420894 


4233 849 


81 


7 (4) 


Thl BR 


TGTGGG 


4227774 


448 603 


2970319 


70 


9 


Th2 


None 


1 197 925 


85 950 


829 754 


69 


7 ,JI 


Th2 BR 


GAATGT 


3 782 956 


391 195 


2 938927 


78 


9 



"Nucleosome DNA from biological replicas (BR) was ligated to barcoded PI and original P2 adaptors. We added 6 bp of unique sequence (barcode) 
to the 3'-end of the SOLiD PI adaptor as follows: CCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGATNNNNNN (N being 
barcode sequence). Barcoding enabled sequencing mixtures of multiple libraries from different conditions simultaneously on one sector of an 
octet slide in a single flow cell, to significantly reduce sequencing costs. 

b Superscript numbers indicate that several independent enrichments with different BACs were performed before the enriched libraries were combined 
and subjected to parallel sequencing. For example, 7 <3) indicates that 3 independent hybridizations that collectively included a total of seven different 
BACs were used for enrichment, and then the enriched DNA from each hybridization was pooled for sequencing. 



region covered by each individual BAC was on average 
> 7000-fold, similar to when a single BAC was used during 
an independent enrichment (Table 1). These results were 
based on nuclei preparations from eight different cell types 
and two biological replicates (Table 2 and data not 
shown). Together, these data show that nucleosomal 
DNA from distinct loci covering megabase regions can 
be enriched simultaneously, efficiently and reproducibly. 

BEM-seq maps genuine nucleosomes with very 
high precision 

To generate nucleosome maps, we sequenced each 
enriched mononucleosome DNA library independently 
using both single-end and paired-end SOLiD sequencing 
chemistries. Alignment of single-end (35-bp tags) or 
paired-end (35- and 25-bp tags) reads to the murine 
genome (mm9) confirmed that enrichment was consistent 
with results from standard sequencing methods (Table 2). 
The aligned reads were used to calculate the center- 
weighted nucleosome occupancy score at each position 
of enriched regions (see 'Materials and Methods' section 
for details). Reads from paired-end sequencing were 
mapped more efficiently than single-end reads, and more 
importantly, paired-end sequencing resulted in better 
mapping accuracy of the nucleosome centers than 
single-end sequencing because both ends of the protected 
DNA were known (data not shown and see 'Materials and 
Methods' section). Using the paired-end data, we 
examined whether called peaks were likely to be the 
center of nucleosomes by 'center aligning' the ±73-bp 
region at the identified occupancy peak position and 
plotting the resulting frequency of AA/AT/TT/TA di- 
nucleotides (see 'Materials and Methods' section for 
details). A clear 10-bp dinucleotide periodicity was 
evident in the alignment (Figure 2A), a hallmark of se- 
quences associated with nucleosomes in vivo because 
they tolerate sharp bending of DNA around the histone 
octamer (1,19-21). These data show that targeted enrich- 
ment of mononucleosomal DNA by hybridization to BAC 
clones can be used to map individual nucleosomes from 
specific megabase genomic regions of the mammalian 
genome. 



We achieved up to ~ 1000-fold coverage (i.e. uniquely 
mapped reads x 147 bp/total bp in BACs used for selec- 
tion) after multiplexed enrichment and paired-end 
sequencing in some experiments. The very high coverage 
afforded by BEM-seq enabled us to examine how 
sequencing coverage could affect mapping precision 
using a simulation study. Approximately 300000 
uniquely aligned paired-end reads of defined length 
(137-157 bp) (Supplementary Figure SI) were used to 
map nucleosomes in 210 kb of the 114 locus. With the 
full data set (~210-fold coverage), 791 peaks were 
identified on the occupancy curve, and the peak positions 
were treated as putative 'true' nucleosome centers. A fixed 
fraction of reads from the full data set were randomly 
sampled, and the putative nucleosomes were identified 
again in the same way as for the original sample. We 
plotted the frequency of distances between the nucleosome 
positions in the original map and the simulated maps from 
different sampling fractions (Figure 2B). At ~20-fold 
coverage, only about half of the peaks mapped back to 
within ±4 bp of the original nucleosome centers, and the 
distance from the original centers rapidly increased for the 
remaining 50% of peaks. In contrast, at 100-fold 
coverage, >80% of peaks could be mapped to within 
±4 bp of the original nucleosome centers. These data 
show that increased sequencing coverage achieved with 
BEM-seq improves the precision of mapping individual 
nucleosome centers. 

BEM-seq discerns differential nucleosome organization in 
distinct T-cell lineages 

To examine whether nucleosomes mapped by BEM-seq 
could be used to discern biologically relevant changes in 
nucleosomes, we inspected nucleosome organization of the 
transcribed region of Prfl in CTL, Thl and Th2 cells. Prfl 
is strongly transcribed in CTL, whereas it is weakly 
transcribed, or silent, in Thl and Th2 cells (11). 
Surprisingly, the Spearman correlations of the overall nu- 
cleosome occupancy scores throughout the transcribed 
region of Prfl in any pairwise comparison of the three 
different cell types were very high. However, the correl- 
ation between Thl and Th2 cells was higher than the 
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Figure 2. BEM-seq maps genuine nucleosomes and high sequencing coverage improves mapping precision. (A) The frequency of dinucleotides in 
sequences of center-aligned peaks from the 114, Ifng and Prfl loci in Thl cells shows a characteristic 10-bp periodicity typical of sequences bound to 
nucleosomes in vivo. (B) The effect of sequencing coverage on mapping nucleosome centers precisely was determined by comparing nucleosome center 
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correlations between CTL and either Thl or Th2 cells, 
indicating that nucleosomes were distinct in CTL 
relative to either Th cell lineage. This result was true 
whether nucleosome maps were calculated based on 
single-end sequencing or paired-end sequencing in differ- 
ent biological replicates (Table 3). Although overall most 
nucleosomes seemed to be organized similarly in all three 
cell types, discrete nucleosomes exhibited strikingly differ- 
ent occupancy or positioning in CTL relative to both Thl 
and Th2 cells. These nucleosomes were primarily located 
at the transcriptional start site (TSS), upstream of the 
promoter and in the second intron of Prfl (Figure 3A), 
and were located near previously identified DNase I 
hypersensitive (DHS) sites in CTL and Thl cells (11). 
Specifically, several nucleosomes in the promoter and 
— 1-kb region showed lower nucleosome occupancy in 
CTL relative to Thl and Th2 cells. In addition, a 
strongly positioned '+1' nucleosome was located 
immediately downstream of the TSS in CTL, but the oc- 
cupancy of this nucleosome was strikingly lower in both 
Thl and Th2 cells (Figure 3A, lower panel). In addition, 
the center location of the +2 nucleosome in Thl and Th2 
cells was shifted towards the TSS compared with its 
location in CTL. Thus, differential organization of 
specific nucleosomes was correlated with strong transcrip- 
tion of Prfl in CTL and weak or inactive transcription in 
Th cells. 

We also examined the transcribed aspects of the Ifng 
and 114 genes. Analogous to observations in the Prfl 
gene, there was overall similarity of the nucleosome 
pattern in both the 114 and Ifng genes in CTL, Thl and 
Th2 cells, whereas the occupancy and the positioning of 
certain discrete nucleosomes were clearly different 
between the three cell-types. These differentially re- 
modeled nucleosomes were located near to the TSS 



Table 3. Spearman correlation of nucleosomes across Prfl in Thl, 
Th2 and CTL 



Chemistry Condition Thl Th2 CTL Thl Th2 CTL 

BR BR BR 



Single-end Thl 1.0 0.9297 0.6799 0.7640 0.7515 0.6432 

Th2 1.0 0.6870 0.7182 0.7718 0.6185 

CTL 1.0 0.7686 0.7480 0.8516 

Paired-end Thl BR 1.0 0.9336 0.8918 

Th2 BR 1.0 0.8567 

CTL BR 1.0 



Nucleosomes were mapped in two biological replicates (denoted BR) 
using single-end sequencing for the first replicate and paired-end 
sequencing for the second replicate. Note that in both cases, correl- 
ations between Thl and Th2 cells were higher than between CTL 
and either Thl or Th2 cells (bold text). 



region and overlapped known DHS sites (22,23). In the 
Ifng gene, the +1 nucleosome exhibited high occupancy in 
all three cell types, whereas its center location was specif- 
ically repositioned in Thl and Th2 cells relative to CTL 
(Figure 3B, upper panel). Interestingly, the repositioned 
+ 1 nucleosome did not seem to be associated with alter- 
ations in any of the immediately adjacent downstream nu- 
cleosomes (Figure 3B, lower panel). In contrast, at the 114 
locus in Th2 cells, a delocalized nucleosome signal covered 
the region where a +1 nucleosome would presumably be 
located downstream of the TSS, and the immediate down- 
stream nucleosomes were substantially different between 
Thl and Th2 cells (Supplementary Figure S2). Taken 
together, these data demonstrate that BEM-seq resolves 
differential changes in individual nucleosomes near the 
transcribed aspects of differentially transcribed genes 
during effector T-cell differentiation. 
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Figure 3. Nucleosome organization around TSSs is cell type specific. (A) BEM-seq identifies changes in specific nucleosomes in Prfl. Nucleosome 
tracks across lOkb of Prfl (chrlO: 60758820-60768819, mm9) in CTL, Thl and Th2 cells and the position of DNase I hypersensitive sites (numbered 
arrows) are shown. Nucleosomes in the +1, +2 and +3 positions are denoted. Note, depletion of nucleosomes at the TSS and — 1-kb region, a 
strongly positioned +1 nucleosome immediately downstream of the TSS and remodeled nucleosome positions near DHS 7 in CTL relative to Thl 
and Th2 cells. The red bars highlight nucleosomes that are altered between at least two cell types. (B) Nucleosome tracks and the position of DNase I 
hypersensitive sites (numbered arrows) across 10 kb of IJng (chrlO: 117875525-117885524) in CTL, Thl and Th2 cells are shown. Note the down- 
stream repositioning of the +1 nucleosome in Thl and Th2 cells relative to CTL, but a lack of attendant remodeling of nucleosomes in the +2 and +3 
positions. 



DISCUSSION 

To map nucleosomes at high-resolution in large genomes 
and focus on relevant loci, we developed an approach 
called BEM-seq, which is based on enriching mono- 
nucleosome DNA derived from specific regions by hybrid- 
ization with BACs, and subjecting the enriched DNA to 
parallel sequencing. We achieved enrichments of up to 



10000-fold for a given locus (~200kb) after a single hy- 
bridization, and we required <18 cycles of PCR to amplify 
the libraries post-enrichment. To reach comparable en- 
richment, a previously reported method for direct 
genomic selection with BACs required two sequential 
rounds of selection and up to 60 cycles of PCR (10). 
Thus, the BEM-seq approach enriches targets more effi- 
ciently; therefore, enriched libraries are likely to be 
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influenced less by bias resulting from extensive PCR 
compared with previous methods. Solution hybridization 
with synthetic oligonucleotides has also been used for 
target-enrichment and sequencing, and it has noted 
skewing of coverage based on nucleotide content (24). 
The skewing might have been because of inefficient hy- 
bridization of particular sequences, but it might also be 
ascribed to artifacts in capture sequences arising during 
oligonucleotide synthesis, combined with known biases 
inherent to preparation and sequencing of next-generation 
libraries (15). In any case, the BEM-seq approach is likely 
to be subject to similar biases during hybridization and 
library preparation, but result in a faithful or better rep- 
resentation of original sequences because of the fidelity of 
BAC-derived sequences relative to synthesized DNA. 
Synthetic oligonucleotides have been used to enrich 
megabase regions of the genome by hybridization, and 
we also demonstrated that BEM-seq enriches many loci 
simultaneously, by 'multiplexing' several different BACs 
in the same hybridization. This enabled high-resolution 
mapping of nucleosomes across ~ 1.9 Mb in eight different 
cell types in a single experiment. Therefore, the BEM-seq 
method is scalable similar to using synthetic oligonucleo- 
tides, yet is much less expensive and could conceivably be 
used to interrogate most or all genes that are differentially 
expressed in cells exposed to different activation stimuli or 
during a particular differentiation process. Thus, 
BEM-seq is a reliable and economical method for enrich- 
ing mononuclesome DNA from targeted, yet expansive 
regions of genomes to sequence the associated nucleo- 
somes at very high coverage. 

The increased nucleosome sequencing coverage 
afforded by BEM-seq improved the precision of 
mapping nucleosomes based on the reads data, and this 
is likely to have translated into knowing their positions 
in vivo more accurately. The apparent positions of nucleo- 
somes mapped by MNase digestion may differ from their 
actual positions in vivo, especially when sequencing 
coverage is low, because MNase prefers to cleave particu- 
lar sequences (25), and there is dynamic behavior of nu- 
cleosomes (18) leading to random variation in read lengths 
and center positions of DNA protected from MNase 
cleavage. We found that paired-end sequencing and 
100-fold coverage mapped a much larger proportion of 
nucleosomes precisely than when using ~20-fold 
coverage (which is roughly comparable with current 
whole-genome single-end sequencing of mononucleosome 
DNA in humans). We found that at ~20-fold coverage, 
the location of a large fraction of nucleosome peaks 
deviated substantially from their likely 'true' centers. We 
presume those nucleosomes whose center locations did not 
deviate drastically at low coverage likely comprised the 
minority of nucleosomes that were 'well positioned', con- 
sistent with findings from whole-genome sequencing of 
mononucleosome DNA from human T cells (6). Beyond 
mapping nucleosomes that are 'well positioned', our data 
indicate that at least 100-fold coverage of mammalian 
genomes with paired-end sequencing may be necessary 
to map most nucleosomes precisely. Finally, given that 
recent studies indicate MNase-derived bias minimally 
affects discerning the actual positions of nucleosomes 



in vivo (21,26), the increased precision afforded by 
BEM-seq likely improves mapping nucleosomes with 
respect to their real positions in vivo. 

Nucleosome organization assayed by whole-genome 
sequencing of mononucleosome DNA is typically 
visualized in aggregate plots as patterns of nucleosome 
occupancy surrounding aligned genomic landmarks, 
such as TSSs. The average pattern of nucleosomes at the 
TSS of active genes consists of a nucleosome-depleted 
region upstream of the TSS, a strongly positioned '+1' 
nucleosome immediately downstream of the TSS and 
phasing of sequential downstream nucleosomes; in genes 
that are not expressed highly, this pattern is less amplified 
(6,9). Our high-resolution mapping of nucleosomes in the 
Prfl, Ifng and 114 genes with BEM-seq indicated that nu- 
cleosome organization around TSSs is gene specific and 
much more complex than as viewed in the average pattern 
of many genes. We noted several important distinctions. 
First, both transcriptionally active and inactive genes ex- 
hibited a nucleosome-depleted region upstream of the 
TSS, suggesting that this configuration is not programmed 
by transcriptional competence itself. Second, although 
transcriptional changes correlated with changes in organ- 
ization of the '+1' nucleosome, in certain cases, occupancy 
of the +1 nucleosome was altered (see the Prfl TSS), 
whereas in others, the location of +1 nucleosome was 
changed, or it was not strongly positioned in the first 
place (see the 114 and Ifng TSSs). Interestingly, differential 
positioning of the +1 nucleosome did not correspond to 
strong changes in the positioning of adjacent nucleosomes 
as would be predicted if placement of the +1 nucleosome 
enforced statistical positioning and downstream phasing 
of nucleosomes (5). Thus, high-resolution mapping of 
discrete nucleosomes reveals finite architecture that is 
not discernable in the ensemble average. Our data 
suggest that BEM-seq is likely to be a powerful 
approach to examine how DNA sequence, transcription 
factor occupancy and chromatin remodeling enzymes pre- 
cisely place individual nucleosomes in mammalians. 

BEM-seq is particularly suited for comparative 
mapping of the same genomic regions under different con- 
ditions or in different cell types. Analysis of the Prfl locus 
revealed previously unappreciated details of chromatin ac- 
cessibility in c/s-acting domains. We previously found that 
the DNase I hypersensitivity pattern of the Prfl promoter 
and immediate 5' flank in CTL and Thl cells did not 
explain the 20-fold higher expression of Prfl in CTL 
compared with Th cells; both cell-types exhibited the 
same DNase I hypersensitivity pattern (11). Here, using 
BEM-seq, however, we detected obvious and reproducible 
cell type-specific differences in nucleosome organization of 
these regions in CTL relative to Th cells. Notably, deple- 
tion of nucleosomes in CTL relative to Th cells correlated 
with high-RNA Polymerase II occupancy at the Prfl TSS, 
and high occupancy of Stat5 and eomesodermin transcrip- 
tion factors at the known Stat5 dependent enhancer at 
— 1 kb, suggesting that these factors may be responsible 
for the altered nucleosome pattern (12,27). We found 
analogous changes in the Ifng and 114 loci. Therefore, 
results based on mapping nucleosomes with BEM-seq 
provide distinct and specific regulatory information 
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compared with other measures of chromatin accessibility, 
such as DNase I hypersensitivity. 

In summary, BEM-seq is likely a broadly applicable 
approach for probing many aspects of the chromatin 
landscape within relevant regions of large and complex 
genomes. In addition to mapping nucleosomes, combining 
BAC-based enrichment with next-generation sequencing is 
likely to be ideal for quantifying the occupancy and 
location of other DNA bound factors, the apposition of 
distal ris-elements and the location of non-coding transcripts 
that may be unrepresented or invisible in the ensemble 
average of whole-genome measurements. Thus, applying 
the BEM-seq approach together with ChlP-seq and chroma- 
tin conformation assays, and RNA-seq is likely useful for 
resolving, in unprecedented detail, the interactions that 
underlie chromatin dynamics and transcription. 

SUPLEMENTARY DATA 

Supplementary Data are available at NAR Online: 
Supplementary Figures 1 and 2. 
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